Space Efficient Quantile Summary for Constrained Sliding Windows on a Data Stream
Authors
Abstract
In many online applications, we need to maintain quantile statistics for a sliding window on a data stream. In its natural form, a sliding window is defined as the most recent N data items. In this paper, we study the problem of estimating quantiles over other types of sliding windows. We present a uniform framework to process quantile queries for time-constrained and filter-based sliding windows. Our algorithm makes one pass over the data stream and maintains an $\epsilon$-approximate summary. It uses $O(\frac{1}{\epsilon^2} \log N)$ space, where N is the number of data items in the window. We extend this framework to process generalized constrained sliding window queries and prove that our technique is applicable to flexible window settings. Our performance study indicates that the space required in practice is much less than the given theoretical bound and that the algorithm supports high-speed data streams.
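To make the query semantics concrete, the following is a minimal sketch of an exact baseline for a time-constrained sliding window, together with a check of what an ε-approximate answer means. It is not the algorithm from the paper: it stores every live item, so it uses O(N) space rather than a compact summary. The names `TimeWindowQuantiles`, `window_seconds`, and `is_eps_approximate` are hypothetical, and the rank convention (the φ-quantile is the element of rank ⌈φN⌉) is an assumption.

```python
import math
from bisect import bisect_left, bisect_right
from collections import deque


class TimeWindowQuantiles:
    """Exact baseline (hypothetical): keeps every item in the window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.items = deque()  # (timestamp, value) pairs in arrival order

    def insert(self, timestamp, value):
        self.items.append((timestamp, value))
        self._expire(timestamp)

    def _expire(self, now):
        # Keep only items with timestamp in (now - window_seconds, now].
        while self.items and self.items[0][0] <= now - self.window_seconds:
            self.items.popleft()

    def quantile(self, phi):
        # Assumed convention: return the element of rank ceil(phi * N).
        values = sorted(v for _, v in self.items)
        if not values:
            raise ValueError("window is empty")
        rank = max(1, math.ceil(phi * len(values)))
        return values[rank - 1]


def is_eps_approximate(answer, values, phi, eps):
    """True if some rank of `answer` is within eps * N of phi * N."""
    ordered = sorted(values)
    n = len(ordered)
    lo = bisect_left(ordered, answer) + 1  # smallest rank of answer
    hi = bisect_right(ordered, answer)     # largest rank of answer
    target = phi * n
    # range(lo, hi + 1) is empty when answer does not occur in values.
    return any(abs(r - target) <= eps * n for r in range(lo, hi + 1))
```

A summary-based method would replace the deque with a much smaller structure and return any value for which `is_eps_approximate` holds, instead of the exact element.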
Similar resources
Fast and Space-Efficient Computation of Equi-Depth Histograms for Data Streams
Equi-depth histograms represent a fundamental synopsis widely used in both database and data stream applications, as they provide the cornerstone of many techniques such as query optimization, approximate query answering, distribution fitting, and parallel database partitioning. Equi-depth histograms try to partition a sequence of data in a way that every part has the same number of data items....
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, many modern applications search for frequent, repeating patterns in the data sets they analyze. Such a search looks for patterns that appear frequently in the data set and marks them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection, work either on uncertai...
Efficient Approximation of Correlated Sums on Data Streams
In many applications such as IP network management, data arrives in streams, and queries over those streams need to be processed online using limited storage. Correlated-sum (CS) aggregates are a natural class of queries formed by composing basic aggregates on (x, y) pairs, and are of the form SUM{g(y) : x ≤ f(AGG(x))}, where AGG(x) can be any basic aggregate and f(), g() are user-specified fun...
Lectures on Streaming Algorithms
• Standard stream model: m elements from a universe of size n arrive one by one. Goal: compute a function of the stream. Constraints: (1) Limited space (working memory), sublinear in n and m. (2) Access data sequentially. (3) Process each element quickly. • Graph stream: elements of the stream are edges. Let n be the number of vertices. We have $m = O(n^2)$. To be able to handle interesting functions, w...
Research on Sliding Window Join Semantics and Join Algorithm in Heterogeneous Data Streams
Sliding windows over data streams have rich semantics, which gives rise to many kinds of window semantics across different data streams, so the join semantics between different types of windows become very complicated. The basic join semantics of data streams, the join semantics of tuple-based sliding windows, and the join semantics of time-based sliding windows have partly solved the semantics of stream joins, bu...
Publication date: 2004